Smart Paper Notes

Acceleration of Subspace Learning Machine via Particle Swarm Optimization and Parallel Processing

Hongyu Fu , Yijing Yang , Yuhuai Liu , Joseph Lin , Ethan Harrison , Vinod K. Mishra , C. -C. Jay Kuo

Category: Machine Learning

2022-08-15

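Built on the decision tree (DT) idea for classification and regression, the subspace learning machine (SLM) was recently proposed to offer higher performance in general classification and regression tasks. Its performance gains come at the cost of higher computational complexity. In this work, we investigate two ways to accelerate SLM. First, we adopt the particle swarm optimization (PSO) algorithm to speed up the search for a discriminant dimension expressed as a linear combination of the current dimensions. Searching for the optimal weights in the linear combination is computationally heavy; the original SLM does it by probabilistic search. Accelerating SLM with PSO requires 10-20 times fewer iterations. Second, we leverage parallel processing in the SLM implementation. Experimental results show that the accelerated SLM achieves a speed-up factor of 577 in training time while maintaining classification/regression performance comparable to the original SLM.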
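
As a rough illustration of the PSO search mentioned in the abstract, the sketch below runs a basic global-best particle swarm over a weight vector. It is not the authors' implementation: `pso_minimize`, `toy_cost`, `TARGET`, and all hyperparameter values are assumptions made for this example, and the toy objective merely stands in for SLM's real split criterion.

```python
import random

def pso_minimize(cost, dim, n_particles=30, n_iters=200, seed=0,
                 inertia=0.729, c_personal=1.494, c_global=1.494):
    """Minimize cost(weights) with a basic global-best particle swarm."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # personal best positions
    best_cost = [cost(p) for p in pos]          # personal best costs
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]  # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull to personal best, pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], c
                if c < g_cost:
                    g_pos, g_cost = pos[i][:], c
    return g_pos, g_cost

# Hypothetical stand-in objective: squared distance to a known weight vector.
TARGET = [0.5, -0.25, 0.1]

def toy_cost(w):
    return sum((wi - ti) ** 2 for wi, ti in zip(w, TARGET))

weights, value = pso_minimize(toy_cost, dim=3)
```

In a real setting, `toy_cost` would be replaced by whatever measure scores a candidate projection; the swarm update itself stays the same.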

Translated by Google Translate

Skeletal Video Anomaly Detection using Deep Learning: Survey, Challenges and Future Directions

Pratik K. Mishra , Alex Mihailidis , Shehroz S. Khan

Category: Computer Vision

2022-12-31

The existing methods for video anomaly detection mostly utilize videos containing identifiable facial and appearance-based features. The use of videos with identifiable faces raises privacy concerns, especially when used in a hospital or community-based setting. Appearance-based features can also be sensitive to pixel-based noise, straining the anomaly detection methods to model the changes in the background and making it difficult to focus on the actions of humans in the foreground. Structural information in the form of skeletons describing the human motion in the videos is privacy-protecting and can overcome some of the problems posed by appearance-based features. In this paper, we present a survey of privacy-protecting deep learning anomaly detection methods using skeletons extracted from videos. We present a novel taxonomy of algorithms based on the various learning approaches. We conclude that skeleton-based approaches for anomaly detection can be a plausible privacy-protecting alternative for video anomaly detection. Lastly, we identify major open research questions and provide guidelines to address them.

Escaping Saddle Points for Effective Generalization on Class-Imbalanced Data

Harsh Rangwani , Sumukh K Aithal , Mayank Mishra , R. Venkatesh Babu

Category: Machine Learning | Computer Vision

2022-12-28

Real-world datasets exhibit imbalances of varying types and degrees. Several techniques based on re-weighting and margin adjustment of loss are often used to enhance the performance of neural networks, particularly on minority classes. In this work, we analyze the class-imbalanced learning problem by examining the loss landscape of neural networks trained with re-weighting and margin-based techniques. Specifically, we examine the spectral density of the Hessian of the class-wise loss, through which we observe that the network weights converge to a saddle point in the loss landscapes of minority classes. Following this observation, we also find that optimization methods designed to escape from saddle points can be effectively used to improve generalization on minority classes. We further theoretically and empirically demonstrate that Sharpness-Aware Minimization (SAM), a recent technique that encourages convergence to flat minima, can be effectively used to escape saddle points for minority classes. Using SAM results in a 6.2% increase in accuracy on the minority classes over the state-of-the-art Vector Scaling Loss, leading to an overall average increase of 4% across imbalanced datasets. The code is available at: https://github.com/val-iisc/Saddle-LongTail.
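
For readers unfamiliar with SAM, a minimal sketch of a single update may help. It follows the commonly described two-step rule (perturb the weights along the normalized gradient, then descend using the gradient taken at the perturbed point). The names `sam_step` and `toy_grad`, the toy quadratic loss, and the values of `lr` and `rho` are assumptions for illustration and are not taken from the linked repository.

```python
import math

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One Sharpness-Aware Minimization update on a list of weights."""
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g)) or 1.0  # guard against a zero gradient
    # Ascent step: move to the approximate worst-case point within radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: apply the gradient from the perturbed point to the original weights.
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]

def toy_grad(w):
    """Gradient of the hypothetical toy loss sum(w_i ** 2)."""
    return [2.0 * wi for wi in w]

w_next = sam_step([1.0, 0.0], toy_grad)
```

With these settings the first coordinate moves from 1.0 to about 0.79, slightly further than the 0.80 a plain gradient step would give, because the gradient is evaluated at the perturbed point 1.05.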

Privacy-Protecting Behaviours of Risk Detection in People with Dementia using Videos

Pratik K. Mishra , Andrea Iaboni , Bing Ye , Kristine Newman , Alex Mihailidis , Shehroz S. Khan

Category: Computer Vision

2022-12-20

People living with dementia often exhibit behavioural and psychological symptoms of dementia that can put their and others' safety at risk. Existing video surveillance systems in long-term care facilities can be used to monitor such behaviours of risk to alert the staff to prevent potential injuries or death in some cases. However, these behaviours of risk events are heterogeneous and infrequent in comparison to normal events. Moreover, analyzing raw videos can also raise privacy concerns. In this paper, we present two novel privacy-protecting video-based anomaly detection approaches to detect behaviours of risk in people with dementia. We either extract body pose information as skeletons or use semantic segmentation masks to replace multiple humans in the scene with their semantic boundaries. Our work differs from most existing approaches for video anomaly detection that focus on appearance-based features, which can put the privacy of a person at risk and are also susceptible to pixel-based noise, including illumination and viewing direction. We used anonymized videos of normal activities to train customized spatio-temporal convolutional autoencoders and identify behaviours of risk as anomalies. We show our results on a real-world study conducted in a dementia care unit with patients with dementia, containing approximately 21 hours of normal activities data for training and 9 hours of data containing normal and behaviours of risk events for testing. We compared our approaches with the original RGB videos and obtained an equivalent area under the receiver operating characteristic curve performance of 0.807 for the skeleton-based approach and 0.823 for the segmentation mask-based approach. This is one of the first studies to incorporate privacy for the detection of behaviours of risk in people with dementia.
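
The reported metric, area under the ROC curve, can be computed directly from anomaly scores. The sketch below uses the pairwise (rank-based) definition; the function name `roc_auc` and the score values are hypothetical and are not data from the study.

```python
def roc_auc(normal_scores, anomaly_scores):
    """AUC as the probability that an anomaly outscores a normal sample (ties count half)."""
    wins = 0.0
    for a in anomaly_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(anomaly_scores) * len(normal_scores))

# Hypothetical per-window reconstruction errors from an autoencoder.
normal = [0.10, 0.12, 0.30, 0.15]
risky = [0.28, 0.45, 0.50]
auc = roc_auc(normal, risky)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.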

Assistive Completion of Agrammatic Aphasic Sentences: A Transfer Learning Approach using Neurolinguistics-based Synthetic Dataset

Rohit Misra , Sapna S Mishra , Tapan K. Gandhi

Category: Natural Language Processing

2022-11-10

Damage to the inferior frontal gyrus (Broca's area) can cause agrammatic aphasia wherein patients, although able to comprehend, lack the ability to form complete sentences. This inability leads to communication gaps which cause difficulties in their daily lives. The usage of assistive devices can help in mitigating these issues and enable the patients to communicate effectively. However, due to lack of large scale studies of linguistic deficits in aphasia, research on such assistive technology is relatively limited. In this work, we present two contributions that aim to re-initiate research and development in this field. Firstly, we propose a model that uses linguistic features from small scale studies on aphasia patients and generates large scale datasets of synthetic aphasic utterances from grammatically correct datasets. We show that the mean length of utterance, the noun/verb ratio, and the simple/complex sentence ratio of our synthetic datasets correspond to the reported features of aphasic speech. Further, we demonstrate how the synthetic datasets may be utilized to develop assistive devices for aphasia patients. The pre-trained T5 transformer is fine-tuned using the generated dataset to suggest 5 corrected sentences given an aphasic utterance as input. We evaluate the efficacy of the T5 model using the BLEU and cosine semantic similarity scores. Affirming results with BLEU score of 0.827/1.00 and semantic similarity of 0.904/1.00 were obtained. These results provide a strong foundation for the concept that a synthetic dataset based on small scale studies on aphasia can be used to develop effective assistive technology.
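
The semantic similarity score mentioned above is a cosine similarity between sentence representations. A minimal sketch follows, assuming the two sentences have already been embedded as vectors. The embedding values are made up, and the abstract does not say which embedding model was used.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical embeddings of a reference sentence and a suggested correction.
reference = [0.2, 0.7, 0.1]
candidate = [0.25, 0.65, 0.05]
score = cosine_similarity(reference, candidate)
```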

Suppressing Noise from Built Environment Datasets to Reduce Communication Rounds for Convergence of Federated Learning

Rahul Mishra , Hari Prabhat Gupta , Tanima Dutta , Sajal K. Das

Category: Machine Learning

2022-09-03

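Smart sensing provides an easier and more convenient data-driven mechanism for monitoring and control in the built environment. Data generated in the built environment are privacy-sensitive and limited. Federated learning is an emerging paradigm that provides privacy-preserving collaboration among multiple participants for model training without sharing private and limited data. Noisy labels in the participants' datasets degrade performance and increase the number of communication rounds needed for federated learning to converge. Such a large number of communication rounds requires more time and energy to train the model. In this paper, we propose a federated learning approach to suppress the unequal distribution of noisy labels across the participants' datasets. The approach first estimates the noise ratio of each participant's dataset and normalizes the noise ratio using the server dataset. The proposed approach can handle bias in the server dataset and minimizes its impact on the participants' datasets. Next, we calculate the optimal weighted contributions of the participants using the normalized noise ratio and influence of each participant. We further derive an expression to estimate the number of communication rounds required for the proposed approach to converge. Finally, experimental results demonstrate the effectiveness of the proposed approach over existing techniques in terms of communication rounds and achieved performance in the built environment.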
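
The abstract does not give the weighting formula, so the following is only a hypothetical sketch of the general idea of noise-aware aggregation: participants with a lower estimated noise ratio contribute more to the averaged model. The rule `weight ∝ 1 - noise_ratio`, the function `aggregate`, and the numbers are assumptions. The sketch also ignores the paper's "influence" term and its server-side normalization.

```python
def aggregate(updates, noise_ratios):
    """Weighted average of participant parameter vectors; cleaner data gets more weight."""
    raw = [max(0.0, 1.0 - r) for r in noise_ratios]  # assumed rule, not from the paper
    total = sum(raw)
    if total == 0.0:
        raise ValueError("all participants are estimated to be fully noisy")
    weights = [w / total for w in raw]
    dim = len(updates[0])
    merged = [sum(w * u[d] for w, u in zip(weights, updates)) for d in range(dim)]
    return merged, weights

# Two hypothetical participants: one clean, one with half of its labels noisy.
updates = [[1.0, 2.0], [3.0, 4.0]]
merged, weights = aggregate(updates, noise_ratios=[0.0, 0.5])
```

Here the clean participant receives weight 2/3 and the noisy one 1/3, so the merged parameters sit closer to the clean participant's update.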

Part-of-Speech Tagging of Odia Language Using statistical and Deep Learning-Based Approaches

Tusarkanta Dalai , Tapas Kumar Mishra , Pankaj K Sa

Category: Natural Language Processing

2022-07-07

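Automatic part-of-speech (POS) tagging is a preprocessing step for many natural language processing (NLP) tasks such as named entity recognition (NER), speech processing, information extraction, word sense disambiguation, and machine translation. It has already achieved promising results for English and European languages, but for Indian languages, and Odia in particular, it has not yet been well explored because of the lack of supporting tools and resources and the morphological richness of the language. Unfortunately, we were unable to find an open-source POS tagger for Odia, and only limited attempts have been made to develop POS taggers for the Odia language. The main contribution of this research work is to present conditional random field (CRF) and deep learning-based approaches (CNN and bidirectional long short-term memory) for developing an Odia part-of-speech tagger. We used a publicly accessible corpus, and the dataset is annotated with the Bureau of Indian Standards (BIS) tagset. However, most languages around the globe use datasets annotated with the Universal Dependencies (UD) tagset. Hence, to maintain uniformity, the Odia dataset should use the same tagset, so we constructed a simple mapping from the BIS tagset to the UD tagset. We experimented with various feature set inputs to the CRF model and observed the impact of the constructed feature sets. The deep learning-based model includes a Bi-LSTM network, a CNN network, a CRF layer, character sequence information, and pre-trained word vectors. Character sequence information was extracted using a convolutional neural network (CNN) and a Bi-LSTM network. Six different combinations of neural sequence labelling models were implemented, and their performance measures were investigated. The Bi-LSTM model with character sequence features and pre-trained word vectors was observed to achieve a significant state-of-the-art result.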
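
A tagset mapping of the kind described can be as simple as a lookup table. The sketch below shows the idea with a small illustrative subset. The specific BIS-to-UD pairs, the name `to_ud`, and the fallback tag `X` are assumptions for this example and are not the authors' published mapping.

```python
# Illustrative subset only: BIS tag names on the left, UD universal POS tags on the right.
BIS_TO_UD = {
    "N_NN": "NOUN",      # common noun
    "N_NNP": "PROPN",    # proper noun
    "PR_PRP": "PRON",    # personal pronoun
    "V_VM": "VERB",      # main verb
    "JJ": "ADJ",         # adjective
    "RB": "ADV",         # adverb
    "PSP": "ADP",        # postposition
    "CC_CCD": "CCONJ",   # coordinating conjunction
    "QT_QTC": "NUM",     # cardinal number
    "RD_PUNC": "PUNCT",  # punctuation
}

def to_ud(bis_tags, fallback="X"):
    """Map a sequence of BIS tags to UD tags, using `fallback` for unmapped tags."""
    return [BIS_TO_UD.get(tag, fallback) for tag in bis_tags]

converted = to_ud(["N_NNP", "N_NN", "V_VM", "RD_PUNC"])
```

Sending unmapped tags to `X` keeps the conversion total, which makes gaps in the table easy to spot in the converted corpus.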

Implicit Channel Learning for Machine Learning Applications in 6G Wireless Networks

Ahmet M. Elbir , Wei Shi , Kumar Vijay Mishra , Anastasios K. Papazafeiropoulos , Symeon Chatzinotas

Category: Artificial Intelligence | Machine Learning

2022-06-24

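With the deployment of fifth-generation (5G) wireless systems gathering momentum across the world, possible technologies for 6G are under active research discussion. In particular, the role of machine learning (ML) in 6G is expected to enhance and aid emerging applications such as virtual and augmented reality, vehicular autonomy, and computer vision. This will result in large volumes of wireless data traffic comprising image, video, and speech. ML algorithms process these data for classification/recognition/estimation through learning models located on cloud servers, which requires wireless transmission of data from edge devices to the cloud server. Channel estimation, handled separately from the recognition step, is critical for accurate learning performance. To combine the learning of the channel and the ML data, we introduce implicit channel learning, which performs ML tasks without estimating the wireless channel. Here, the ML models are trained with channel-corrupted datasets in place of nominal data. Without channel estimation, the proposed approach shows approximately 60% improvement in image and speech classification tasks across diverse scenarios such as millimeter-wave and IEEE 802.11p vehicular channels.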
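
To make "channel-corrupted dataset" concrete, the sketch below passes samples through a simple flat-fading channel with additive Gaussian noise, y = h·x + n. This single-tap Rayleigh model, the function `corrupt_with_channel`, and all numeric values are assumptions for illustration. The paper's millimeter-wave and IEEE 802.11p channel models are more elaborate.

```python
import random

def corrupt_with_channel(samples, h, noise_std, rng):
    """Apply a flat-fading gain h and additive complex Gaussian noise to each sample."""
    received = []
    for x in samples:
        noise = complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
        received.append(h * x + noise)
    return received

rng = random.Random(0)
# Assumed Rayleigh-fading gain: complex Gaussian with unit average power.
h = complex(rng.gauss(0.0, 0.5 ** 0.5), rng.gauss(0.0, 0.5 ** 0.5))
clean = [1.0, -1.0, 0.5]
received = corrupt_with_channel(clean, h, noise_std=0.1, rng=rng)
```

A model trained on `received` rather than `clean` sees the channel's effect during training, which is the idea the abstract describes as learning the channel implicitly.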

A Closer Look at Smoothness in Domain Adversarial Training

Harsh Rangwani , Sumukh K Aithal , Mayank Mishra , Arihant Jain , R. Venkatesh Babu

Category: Machine Learning | Computer Vision

2022-06-16

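Domain adversarial training has been ubiquitous for achieving invariant representations and is widely used for various domain adaptation tasks. Recently, methods that converge to smooth optima have shown improved generalization on supervised learning tasks such as classification. In this work, we analyze the effect of smoothness-enhancing formulations on domain adversarial training, whose objective is a combination of a task loss (e.g., classification or regression) and adversarial terms. We find that converging to smooth minima with respect to (w.r.t.) the task loss stabilizes adversarial training, leading to better performance on the target domain. In contrast to the task loss, our analysis shows that converging to smooth minima w.r.t. the adversarial loss leads to sub-optimal generalization on the target domain. Based on this analysis, we introduce the Smooth Domain Adversarial Training (SDAT) procedure, which effectively enhances the performance of existing domain adversarial methods on both classification and object detection tasks. Our analysis also provides insight into the community's extensive use of SGD over Adam for domain adversarial training.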

Computing the Performance of A New Adaptive Sampling Algorithm Based on The Gittins Index in Experiments with Exponential Rewards

James K. He , Sofía S. Villar , Lida Mavrogonatou

Category: Machine Learning

2023-01-03

Designing experiments often requires balancing between learning about the true treatment effects and earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP that can simultaneously attain optimality and computational efficiency goals, and it has been recently used in experiments with Bernoulli and Gaussian rewards. For the first time, we present a modification of the GI rule that can be used in experiments with exponentially-distributed rewards. We report its performance in simulated 2-armed and 3-armed experiments. Compared to traditional non-adaptive designs, our novel GI-modified design shows operating characteristics comparable in learning (e.g. statistical power) but substantially better in earning (e.g. direct benefits). This illustrates the potential that designs using a GI approach to allocate participants have to improve participant benefits, increase efficiencies, and reduce experimental costs in adaptive multi-armed experiments with exponential rewards.
